MegaMan X3 Decompression Routine
Documented by Euclid, 31/1/2006

------
Some rants/notes:

People always assume "OMG MMX3! This game uses a C4 chip, the C4 chip does the decompression, blah blah blah". Well, I proved them WRONG! The C4 chip is ONLY responsible for OpenGL-like functions (rotation etc., maybe 3D stuff too, but I haven't even played the game past the title screen lol). The gfx are all there in the ROM, compressed in a weird format.

For those who just want the algorithm, read the C code (and the bit above it for the location of the gfx pointers).
For those who want to see a "roughly-commented" version of the asm, scroll down.

--------------------------
Decompression related data

All addresses are for non-headered ROMs.

0x37732 to 0x37B83 - gfx pointers, there are 221 of them (followed by lots of FFs), 5 bytes each:
 - the first 3 bytes are a 3-byte pointer
 - the next 2 bytes are the length

C code example (it works, I've tried it; the addresses used are for the title screen gfx):

#include <stdio.h>

unsigned char readBuf[0x2800];
unsigned char writeBuf[0x8000];
unsigned int length = 0x2600;

void decompress()
{
    unsigned int i = 0, j = 0;
    unsigned char codeByte = readBuf[i];    //first control byte
    unsigned int codeByteBitTest = 7;       //bits are tested from bit 7 down to bit 0
    unsigned int count = 0;                 //number of bytes written so far
    i++;
    while (count < length)
    {
        if ((codeByte & (1 << codeByteBitTest)) != 0)
        {
            //read length (top 6 bits of the next byte)
            unsigned char nextByte = readBuf[i];
            unsigned int len = nextByte / 4;
            unsigned int start_pos;
            unsigned int k, oldJ;
            i++;
            count += len;
            //distance back from the current write position (bottom 2 bits + next byte)
            start_pos = (nextByte & 3) * 256 + readBuf[i];
            i++;
            //copy bytes, one at a time, from earlier in the write buffer
            oldJ = j;
            for (k = 0; len != 0; k++)
            {
                writeBuf[j] = writeBuf[(oldJ - start_pos) + k];
                j++;
                len--;
            }
        }
        else
        {
            //read one byte
            writeBuf[j] = readBuf[i];
            i++;
            j++;
            count++;
        }
        if (codeByteBitTest == 0)
        {
            //all 8 bits used, fetch a new control byte
            codeByte = readBuf[i];
            i++;
            codeByteBitTest = 8;
        }
        codeByteBitTest--;
    }
}

int main()
{
    FILE* fp = fopen("Mega_Man_X_3_(U).smc","rb");
    FILE* wp = fopen("out.bin","wb");   //binary mode, so the output is not mangled on Windows
    if (fp == NULL || wp == NULL)
        return 1;
    fseek(fp,0xE84B3,0);
    fread(readBuf,sizeof(unsigned char),0xA00,fp);
    fclose(fp);
    decompress();
    fwrite(writeBuf,sizeof(unsigned char),0x2000,wp);
    fclose(wp);
    return 0;
}

Decompression code, kinda commented:

$F6 - 3 byte pointer (uses Y register for the low addresses, only bank value in $F8)
$F9 - buffer write pointer value - 7Fxxxx
$FB - 2 bytes length
$FD - first code byte
$FE - "which bit is being tested" value
      08 = x--- ----
      07 = -x-- ----
      ...etc

$00/B74A BD 32 F7    LDA $F732,x[$06:F77D]  ; load low 16 bits of the ptr (--:xxxx)
$00/B74D A8          TAY
$00/B74E BD 33 F7    LDA $F733,x[$06:F77E]  ; load bank byte of the ptr (xx:----)
$00/B751 85 F7       STA $F7    [$00:00F7]
$00/B753 64 F6       STZ $F6    [$00:00F6]  ; store 0 to the low bytes, Y is used as the low address instead.
$00/B755 BD 35 F7    LDA $F735,x[$06:F780]  ; load length of this gfx block
$00/B758 85 FB       STA $FB    [$00:00FB]  ; store length
$00/B75A 8B          PHB
$00/B75B A6 F9       LDX $F9    [$00:00F9]  ; load buffer write value
$00/B75D F4 06 7F    PEA $7F06  [$06:7F06]  ; pushes 06 7F
$00/B760 AB          PLB                    ; change data bank to $06 (last value in the last push)
$00/B761 08          PHP
$00/B762 20 87 81    JSR $8187  [$00:8187]  ; this routine does something to the stack... change bank perhaps?
$00/B765 28          PLP
$00/B766 AB          PLB
$00/B767 E2 20       SEP #$20
$00/B769 A9 08       LDA #$08
$00/B76B 85 FE       STA $FE    [$00:00FE]  ; store new bit-test value
$00/B76D B7 F6       LDA [$F6],y[$1C:9CA4]  ; take first byte (control byte/code byte/whatever)
$00/B76F 85 FD       STA $FD    [$00:00FD]
$00/B771 C8          INY                    ; increase address
$00/B772 D0 05       BNE $05    [$B779]     ; if address wrapped to 0, fall through and switch to next bank
$00/B774 A0 00 80    LDY #$8000             ; "switch to next bank code"
$00/B777 E6 F8       INC $F8    [$00:00F8]

; Bit-test control byte code.
$00/B779 06 FD       ASL $FD    [$00:00FD]  ; if bit x of $FE (see above) on control byte is a 0 then jump
$00/B77B 90 6C       BCC $6C    [$B7E9]

; case 1 on the code byte.
$00/B77D B7 F6       LDA [$F6],y[$1C:9CA6]  ; load next byte
$00/B77F 4A          LSR A
$00/B780 4A          LSR A
$00/B781 85 00       STA $00    [$00:0000]  ; use the top 6 bits as the counter.
$00/B783 85 04       STA $04    [$00:0004]  ; store extra copy of counter.
$00/B785 B7 F6       LDA [$F6],y[$1C:9CA6]  ; load same byte as before
$00/B787 29 03       AND #$03               ; take bottom 2 bits 0000 00xx
$00/B789 85 03       STA $03    [$00:0003]  ; save as the upper byte of "write buffer relative start position to the current position"
$00/B78B C8          INY                    ; increase address
$00/B78C D0 05       BNE $05    [$B793]     ; if address wrapped to 0, fall through and switch to next bank
$00/B78E A0 00 80    LDY #$8000             ; "switch to next bank code"
$00/B791 E6 F8       INC $F8    [$00:00F8]
$00/B793 B7 F6       LDA [$F6],y[$1C:9CA7]  ; load next byte
$00/B795 85 02       STA $02    [$00:0002]  ; save as the lower byte of "write buffer relative start position to the current position"
$00/B797 C8          INY                    ; increase address
$00/B798 D0 05       BNE $05    [$B79F]     ; if address wrapped to 0, switch to next bank (not even going to include code argh)
$00/B79F 5A          PHY                    ; save read buffer pos
$00/B7A0 C2 20       REP #$20
$00/B7A2 8A          TXA                    ; take write buffer pos to A
$00/B7A3 38          SEC
$00/B7A4 E5 02       SBC $02    [$00:0002]  ; subtract $02 (the relative start position)
$00/B7A6 A8          TAY                    ; move it to the Y for now
$00/B7A7 E2 20       SEP #$20
$00/B7A9 B9 00 00    LDA $0000,y[$7F:0000]  ; load byte at Y
$00/B7AC 9D 00 00    STA $0000,x[$7F:0001]  ; write
$00/B7AF E8          INX                    ; move on to the next write position in the write buffer
$00/B7B0 C8          INY                    ; move on to the next byte to read from the write buffer
$00/B7B1 C6 00       DEC $00    [$00:0000]  ; decrease count
$00/B7B3 D0 F4       BNE $F4    [$B7A9]     ; loop
$00/B7B5 7A          PLY                    ; return y to the rom address read buffer.
$00/B7B6 64 05       STZ $05    [$00:0005]  ; 0 to $05
$00/B7B8 C2 20       REP #$20
$00/B7BA A5 FB       LDA $FB    [$00:00FB]  ; load count
$00/B7BC 38          SEC
$00/B7BD E5 04       SBC $04    [$00:0004]  ; subtract the counter from count (you've just copied that many bytes)
$00/B7BF 85 FB       STA $FB    [$00:00FB]
$00/B7C1 48          PHA                    ; from here onwards it's not relevant to the decompression...
$00/B7C2 8A          TXA                    ; take x buffer
$00/B7C3 38          SEC
$00/B7C4 E5 F9       SBC $F9    [$00:00F9]  ; subtract pointer value.
$00/B7C6 18          CLC
$00/B7C7 65 FB       ADC $FB    [$00:00FB]  ; add length
$00/B7C9 CF 00 D0 7F CMP $7FD000[$7F:D000]  ; wtf, another one of these useless things
$00/B7CD F0 01       BEQ $01    [$B7D0]
$00/B7CF EA          NOP
$00/B7D0 68          PLA                    ; pop the copy of count.
$00/B7D1 E2 20       SEP #$20
$00/B7D3 F0 37       BEQ $37    [$B80C]     ; if count is 0 goto $B80C (exit)
$00/B7D5 10 01       BPL $01    [$B7D8]     ; another one?
$00/B7D7 EA          NOP
$00/B7D8 F4 06 7F    PEA $7F06  [$7F:7F06]  ; same code as above
$00/B7DB AB          PLB
$00/B7DC 08          PHP
$00/B7DD 20 87 81    JSR $8187  [$00:8187]
$00/B7E0 28          PLP
$00/B7E1 AB          PLB
$00/B7E2 C6 FE       DEC $FE    [$00:00FE]  ; decrease bit-tested value
$00/B7E4 D0 93       BNE $93    [$B779]     ; if it's not 0 goto Bit-test control byte code.
$00/B7E6 4C 5D B7    JMP $B75D  [$7F:B75D]  ; else goto the top

; case 0 on the code byte.
$00/B7E9 B7 F6       LDA [$F6],y[$1C:9CA5]  ; read next byte
$00/B7EB 9D 00 00    STA $0000,x[$7F:0000]  ; store into the gfx buffer
$00/B7EE E8          INX
$00/B7EF C8          INY
$00/B7F0 D0 05       BNE $05    [$B7F7]     ; if address wrapped to 0, fall through and switch to next bank
$00/B7F2 A0 00 80    LDY #$8000             ; "switch to next bank code"
$00/B7F5 E6 F8       INC $F8    [$00:00F8]
$00/B7F7 C2 20       REP #$20
$00/B7F9 C6 FB       DEC $FB    [$00:00FB]  ; decrease count
$00/B7FB E2 20       SEP #$20
$00/B7FD F0 0D       BEQ $0D    [$B80C]     ; if count is 0 ...jump to bottom, ie exit
$00/B7FF 10 01       BPL $01    [$B802]     ; if count is 1xxx xxxx... NOP? wtf?
$00/B801 EA          NOP
$00/B802 C6 FE       DEC $FE    [$00:00FE]  ; decrease the bit-test value
$00/B804 F0 03       BEQ $03    [$B809]     ; if $FE is zero, a new control byte is needed (skip to $B809)
$00/B806 4C 79 B7    JMP $B779  [$7F:B779]  ; $FE is not zero, goto the bit-test control byte code
$00/B809 4C 5D B7    JMP $B75D  [$7F:B75D]  ; $FE is zero, go back to the beginning and fetch a new control byte
$00/B80C AB          PLB
$00/B80D E0 01 80    CPX #$8001             ; if write buffer position passes 7F8000 mark...
$00/B810 B0 FE       BCS $FE    [$B810]     ; --- infinite loop? (or wait for something?)
$00/B812 86 F9       STX $F9    [$00:00F9]  ; write buffer position value to $F9
$00/B814 64 F5       STZ $F5    [$00:00F5]  ; write 0 to $F5?
$00/B816 4C 5A 81    JMP $815A  [$06:815A]  ; jump to other code.... exit routine.

routine $8187 --- NOT RELEVANT TO THE DECOMPRESSION!

$00/8187 E2 20       SEP #$20               A:0A00 X:0000 Y:9CA4 P:envmxdIzc
$00/8189 2C CE 09    BIT $09CE  [$06:09CE]  A:0A00 X:0000 Y:9CA4 P:envMxdIzc
$00/818C 30 01       BMI $01    [$818F]     A:0A00 X:0000 Y:9CA4 P:envMxdIZc
$00/818E 60          RTS                    A:0A00 X:0000 Y:9CA4 P:envMxdIZc
$00/818F DA          PHX                    A:0827 X:01D9 Y:9E1D P:eNVMxdIzc
$00/8190 5A          PHY                    A:0827 X:01D9 Y:9E1D P:eNVMxdIzc
$00/8191 08          PHP                    A:0827 X:01D9 Y:9E1D P:eNVMxdIzc
$00/8192 C2 20       REP #$20               A:0827 X:01D9 Y:9E1D P:eNVMxdIzc
$00/8194 E2 10       SEP #$10               A:0827 X:01D9 Y:9E1D P:eNVmxdIzc
$00/8196 A6 A0       LDX $A0    [$00:00A0]  A:0827 X:00D9 Y:001D P:eNVmXdIzc
$00/8198 A9 02 01    LDA #$0102             A:0827 X:0060 Y:001D P:enVmXdIzc
$00/819B 95 30       STA $30,x  [$00:0090]  A:0102 X:0060 Y:001D P:enVmXdIzc
$00/819D 3B          TSC                    A:0102 X:0060 Y:001D P:enVmXdIzc
$00/819E 95 34       STA $34,x  [$00:0094]  A:02B5 X:0060 Y:001D P:enVmXdIzc
$00/81A0 4C FB 80    JMP $80FB  [$06:80FB]  A:02B5 X:0060 Y:001D P:enVmXdIzc
$00/8117 D6 31       DEC $31,x  [$00:0031]  A:0202 X:0000 Y:001D P:enVMXdIZC
$00/8119 F0 30       BEQ $30    [$814B]     A:0202 X:0000 Y:001D P:enVMXdIZC
$00/814B 86 A0       STX $A0    [$00:00A0]  A:0202 X:0000 Y:001D P:enVMXdIZC
$00/814D A9 03       LDA #$03               A:0202 X:0000 Y:001D P:enVMXdIZC
$00/814F 95 30       STA $30,x  [$00:0030]  A:0203 X:0000 Y:001D P:enVMXdIzC
$00/8151 C2 30       REP #$30               A:0203 X:0000 Y:001D P:enVMXdIzC
$00/8153 B5 34       LDA $34,x  [$00:0034]  A:0203 X:0000 Y:001D P:enVmxdIzC
$00/8155 1B          TCS                    A:0138 X:0000 Y:001D P:enVmxdIzC
$00/8156 28          PLP                    A:0138 X:0000 Y:001D P:enVmxdIzC
$00/8157 7A          PLY                    A:0138 X:0000 Y:001D P:envmXdIZc
$00/8158 FA          PLX                    A:0138 X:0000 Y:0001 P:envmXdIzc
$00/8159 60          RTS                    A:0138 X:0000 Y:0001 P:envmXdIZc

; other code.... not relevant to gfx.
$00/815A E2 30       SEP #$30               A:0000 X:0A00 Y:A307 P:eNvMxdIzc
$00/815C A6 A0       LDX $A0    [$00:00A0]  A:0000 X:0000 Y:0007 P:eNvMXdIzc
$00/815E 74 30       STZ $30,x  [$00:0090]  A:0000 X:0060 Y:0007 P:envMXdIzc
$00/8160 80 B9       BRA $B9    [$811B]     A:0000 X:0060 Y:0007 P:envMXdIzc
$00/8128 9C CE 09    STZ $09CE  [$06:09CE]  A:0070 X:0070 Y:0007 P:envMXdIZC
$00/812B 4C FB 80    JMP $80FB  [$06:80FB]  A:0070 X:0070 Y:0007 P:envMXdIZC
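
--------------------------
Sketch: reading the gfx pointer table

Going back to the gfx pointer table at 0x37732, here is a minimal C sketch of how one 5-byte entry could be decoded and turned into a file offset. It is not taken from the game. It assumes the pointer is stored low byte first (low, high, bank), which is how the asm reads it through $F732/$F733, that the 2-byte length is low byte first as well, and that the ROM uses a LoROM-style layout (0x8000 bytes per bank), which fits $06:F732 landing on file offset 0x37732. The names GfxEntry, parse_gfx_entry and lorom_to_offset are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values taken from the notes above (non-headered ROM). */
#define GFX_TABLE_OFFSET 0x37732u
#define GFX_ENTRY_SIZE   5u
#define GFX_ENTRY_COUNT  221u

/* Hypothetical struct for one table entry: 3-byte pointer + 2-byte length. */
typedef struct {
    uint32_t snes_addr;  /* bank:address packed as 0xBBAAAA */
    uint16_t length;     /* the 2-byte length field */
} GfxEntry;

/* Decode one 5-byte entry. Assumes little-endian byte order for both fields. */
static GfxEntry parse_gfx_entry(const unsigned char *p)
{
    GfxEntry e;
    assert(p != NULL);
    e.snes_addr = (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
    e.length = (uint16_t)(p[3] | (p[4] << 8));
    return e;
}

/* Convert a bank:address pair to a file offset, assuming a LoROM-style
   mapping where each bank holds 0x8000 bytes of ROM. */
static uint32_t lorom_to_offset(uint32_t snes_addr)
{
    uint32_t bank = (snes_addr >> 16) & 0x7Fu;
    uint32_t addr = snes_addr & 0x7FFFu;
    return bank * 0x8000u + addr;
}
```

Under those assumptions, entry n sits at file offset GFX_TABLE_OFFSET + n * GFX_ENTRY_SIZE, and the read address $1C:9CA4 seen in the trace works out to file offset 0xE1CA4.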
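
--------------------------
Sketch: the 2-byte copy command

The "case 1" branch of the C routine and the asm at $B77D unpack the same two bytes: the top 6 bits of the first byte are the copy length, and its bottom 2 bits plus the whole second byte give the distance back from the current write position. The small C sketch below pulls that out on its own so the bit layout is easier to see. The names CopyToken, decode_copy_token and apply_copy_token are hypothetical and only used here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct for one decoded copy command. */
typedef struct {
    unsigned int len;       /* bytes to copy, 0..63 (top 6 bits of byte 1) */
    unsigned int distance;  /* how far back to start reading, 0..1023 */
} CopyToken;

/* Split the two command bytes the same way the decompressor does. */
static CopyToken decode_copy_token(unsigned char first, unsigned char second)
{
    CopyToken t;
    t.len = first >> 2;
    t.distance = ((unsigned int)(first & 3) << 8) | second;
    return t;
}

/* Copy byte by byte, like the loop at $B7A9, so that a run may overlap
   the bytes it is writing. Returns the new write position. */
static size_t apply_copy_token(unsigned char *out, size_t pos, CopyToken t)
{
    size_t k;
    assert(t.distance <= pos);  /* never read before the start of the buffer */
    for (k = 0; k < t.len; k++)
        out[pos + k] = out[pos - t.distance + k];
    return pos + t.len;
}
```

For example, the bytes 0C 02 decode to a length of 3 and a distance of 2, so with "ab" already in the buffer the copy appends "aba".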